Clustering Time Series Based on Forecast Distributions Using Kullback-Leibler Divergence
Authors
Abstract
Clustering is one of the key tasks in time series data mining. However, traditional clustering methods focus on the similarity of time series patterns in past periods. In many cases, such as retail sales, it is preferable to cluster on future forecast values. In this paper, we present an approach that clusters forecasts, or forecast time series patterns, based on the Kullback-Leibler divergences among the forecast densities. We use the same normality assumption for the error terms that is used to calculate forecast confidence intervals from the forecast model, so the method requires no additional computation to obtain the forecast densities for the Kullback-Leibler divergences. This makes our approach suitable for mining very large sets of time series. A simulation study and two real data sets are used to evaluate and illustrate the method. The results show that using the Kullback-Leibler divergence yields better clustering when there is a degree of uncertainty in the forecasts.
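As a rough illustration of the idea in the abstract, the sketch below computes a Kullback-Leibler-based distance between normal forecast densities using only each forecast's mean and standard deviation. It is not the authors' implementation: the function names, the symmetrized form of the divergence, and the choice to sum over forecast steps as if they were independent are all assumptions made for the example.

```python
import math


def kl_normal(mu_p, sigma_p, mu_q, sigma_q):
    """Closed-form KL(P || Q) for P = N(mu_p, sigma_p^2) and Q = N(mu_q, sigma_q^2)."""
    return (
        math.log(sigma_q / sigma_p)
        + (sigma_p ** 2 + (mu_p - mu_q) ** 2) / (2.0 * sigma_q ** 2)
        - 0.5
    )


def symmetric_kl(mu_p, sigma_p, mu_q, sigma_q):
    """Symmetrized divergence KL(P || Q) + KL(Q || P) (an assumed choice)."""
    return kl_normal(mu_p, sigma_p, mu_q, sigma_q) + kl_normal(
        mu_q, sigma_q, mu_p, sigma_p
    )


def forecast_distance(forecast_a, forecast_b):
    """Distance between two forecast series.

    Each series is a list of (mean, std) pairs, one per forecast step.
    Summing over steps treats the steps as independent, which is a
    simplifying assumption of this sketch.
    """
    return sum(
        symmetric_kl(mu_a, sd_a, mu_b, sd_b)
        for (mu_a, sd_a), (mu_b, sd_b) in zip(forecast_a, forecast_b)
    )


def distance_matrix(forecasts):
    """Pairwise distance matrix (list of lists) for a list of forecast series."""
    n = len(forecasts)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = forecast_distance(forecasts[i], forecasts[j])
            matrix[i][j] = d
            matrix[j][i] = d
    return matrix


# Same gap between forecast means (100 vs 105), different uncertainty.
sharp = symmetric_kl(100.0, 2.0, 105.0, 2.0)       # narrow forecast densities
uncertain = symmetric_kl(100.0, 20.0, 105.0, 20.0)  # wide forecast densities
print(sharp, uncertain)
```

With equal standard deviations the symmetrized divergence reduces to the squared mean gap divided by the variance, so the same five-unit gap counts as a large distance when the forecasts are sharp and a small one when they are uncertain. The resulting matrix could be passed to any clustering routine that accepts precomputed distances.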
Similar resources
Clustering Symbolic Time-Series using L-tuples
Among the many dimensionality reduction methods for time-series data, Symbolic Aggregate approXimation (SAX) is perhaps the most popular due to its simplicity and uniqueness. With SAX, time-series data can be represented as string sequences, which enables the use of methods from text mining and bioinformatics to enhance data mining tasks. We propose an application of L-tuples to impro...
A Comparison of Optimal Operation of a Residential Fuel Cell Co-Generation System Using Clustered Demand Patterns Based on Kullback-Leibler Divergence
When evaluating residential energy systems like co-generation systems, hot water and electricity demand profiles are critical. In this paper, the authors aim to extract basic time-series demand patterns from two kinds of measured demand (electricity and domestic hot water), and also aim to reveal effective demand patterns for primary energy saving. Time-series demand data are categorized with a...
Model Confidence Set Based on Kullback-Leibler Divergence Distance
Consider the problem of estimating the true density h(.) based upon a random sample X1,…, Xn. In general, h(.) is approximated using an appropriate (in some sense, see below) model fƟ(x). Using Vuong's (1989) test along with a collection of k (> 2) non-nested models, this article constructs a set of appropriate models, called a model confidence set, for the unknown model h(.). Application of such confide...
Kullback-Leibler Divergence Measurement for Clustering Based On Probability Distribution Similarity
Clustering based on distribution measurement is an essential task in data mining. Previous methods, which extend traditional partitioning-based clustering methods like k-means and density-based clustering methods like DBSCAN, rely on geometric measurements between objects. Probability distributions have not been considered in measuring the similarity between objects. In this paper, object...
Comparison of Kullback-Leibler, Hellinger and LINEX with Quadratic Loss Function in Bayesian Dynamic Linear Models: Forecasting of Real Price of Oil
In this paper we examine the application of the Kullback-Leibler, Hellinger and LINEX loss functions in a dynamic linear model (DLM), using 106 years of data on the real price of oil from 1913 to 2018, with attention to the asymmetry problem in filtering and forecasting. We use the DLM form of the basic Hotelling model under the quadratic, Kullback-Leibler, Hellinger and LINEX loss functions, trying to address the ...